SNPs Problems, Complexity, and Algorithms

نویسندگان

  • Giuseppe Lancia
  • Vineet Bafna
  • Sorin Istrail
  • Ross Lippert
  • Russell Schwartz
چکیده

Single nucleotide polymorphisms (SNPs) are the most frequent form of human genetic variation. They are of fundamental importance for a variety of applications including medical diagnostic and drug design. They also provide the highest–resolution genomic fingerprint for tracking disease genes. This paper is devoted to algorithmic problems related to computational SNPs validation based on genome assembly of diploid organisms. In diploid genomes, there are two copies of each chromosome. A description of the SNPs sequence information from one of the two chromosomes is called SNPs haplotype. The basic problem addressed here is the Haplotyping, i.e., given a set of SNPs prospects inferred from the assembly alignment of a genomic region of a chromosome, find the maximally consistent pair of SNPs haplotypes by removing data “errors” related to DNA sequencing errors, repeats, and paralogous recruitment. In this paper, we introduce several versions of the problem from a computational point of view. We show that the general SNPs Haplotyping Problem is NP–hard for mate–pairs assembly data, and design polynomial time algorithms for fragment assembly data. We give a network–flow based polynomial algorithm for the Minimum Fragment Removal Problem, and we show that the Minimum SNPs Removal problem amounts to finding the largest independent set in a weakly triangulated graph.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A New Approach to Solve N-Queen Problem with Parallel Genetic Algorithm

Over the past few decades great efforts were made to solve uncertain hybrid optimization problems. The n-Queen problem is one of such problems that many solutions have been proposed for. The traditional methods to solve this problem are exponential in terms of runtime and are not acceptable in terms of space and memory complexity. In this study, parallel genetic algorithms are proposed to solve...

متن کامل

Meta-heuristic Algorithms for an Integrated Production-Distribution Planning Problem in a Multi-Objective Supply Chain

In today's globalization, an effective integration of production and distribution plans into a unified framework is crucial for attaining competitive advantage. This paper addresses an integrated multi-product and multi-time period production/distribution planning problem for a two-echelon supply chain subject to the real-world variables and constraints. It is assumed that all transportations a...

متن کامل

Three Hybrid Metaheuristic Algorithms for Stochastic Flexible Flow Shop Scheduling Problem with Preventive Maintenance and Budget Constraint

Stochastic flexible flow shop scheduling problem (SFFSSP) is one the main focus of researchers due to the complexity arises from inherent uncertainties and also the difficulty of solving such NP-hard problems. Conventionally, in such problems each machine’s job process time may encounter uncertainty due to their relevant random behaviour. In order to examine such problems more realistically, fi...

متن کامل

Applications of two new algorithms of cuckoo optimization (CO) and forest optimization (FO) for solving single row facility layout problem (SRFLP)

Nowadays, due to inherent complexity of real optimization problems, it has always been a challenging issue to develop a solution algorithm to these problems. Single row facility layout problem (SRFLP) is a NP-hard problem of arranging a number of rectangular facilities with varying length on one side of a straight line with aim of minimizing the weighted sum of the distance between all facility...

متن کامل

Improved teaching–learning-based and JAYA optimization algorithms for solving flexible flow shop scheduling problems

Flexible flow shop (or a hybrid flow shop) scheduling problem is an extension of classical flow shop scheduling problem. In a simple flow shop configuration, a job having ‘g’ operations is performed on ‘g’ operation centres (stages) with each stage having only one machine. If any stage contains more than one machine for providing alternate processing facility, then the problem...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2001